Suffix tree-based approach to detecting duplications in sequence diagrams
Abstract
Models are core artefacts in software development and maintenance. Consequently, the quality of models, especially their maintainability and extensibility, becomes a major concern for most non-trivial applications. For various reasons, software models usually contain duplications. These duplications should be detected and removed because they may reduce the maintainability, extensibility and reusability of models. As an initial attempt to address the issue, the author proposes an approach in this study to detecting duplications in sequence diagrams. With special preprocessing, the author converts two-dimensional (2-D) sequence diagrams into a 1-D array. Then the author constructs a suffix tree for the array. With the suffix tree, duplications are detected and reported. To ensure that every duplication detected with the suffix tree can be extracted as a separate reusable sequence diagram, the author revises the traditional construction algorithm of suffix trees by proposing a special algorithm to detect the longest common prefixes of suffixes. The author also explores approaches to removing duplications. The proposed approach has been implemented in DuplicationDetector. With the implementation, the author evaluated the proposed approach on six industrial applications. Evaluation results suggest that the approach is effective in detecting duplications in sequence diagrams. The main contributions of the study are an approach to detecting duplications in sequence diagrams, a prototype implementation and an initial evaluation.
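The pipeline described in the abstract (flatten the diagram to a 1-D array, index its suffixes, read duplications off common prefixes) can be illustrated with a minimal sketch. This is not the DuplicationDetector implementation: the message encoding `(sender, receiver, name)`, the function names and the `min_len` parameter are all assumptions made for illustration, and a naively sorted list of suffixes stands in for the suffix tree. The cap on the match length is only a rough analogue of the requirement that each duplication be extractable as a separate diagram.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical encoding of one message: (sender lifeline, receiver lifeline, name).
Message = Tuple[str, str, str]


def common_prefix_len(a: Sequence[Message], b: Sequence[Message]) -> int:
    """Number of leading messages shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def find_duplication(
    messages: List[Message], min_len: int = 2
) -> Optional[Tuple[int, int, int]]:
    """Return (length, start_a, start_b) for a repeated run of messages, or None.

    Suffixes are sorted naively (a stand-in for a suffix tree); repeats then
    show up as common prefixes of neighbouring suffixes. Because of the
    overlap cap below, the result is a valid non-overlapping repeat but is
    not guaranteed to be the longest one.
    """
    order = sorted(range(len(messages)), key=lambda i: messages[i:])
    best = None
    for a, b in zip(order, order[1:]):
        lo, hi = min(a, b), max(a, b)
        # Cap the match so the two occurrences do not overlap.
        length = min(common_prefix_len(messages[lo:], messages[hi:]), hi - lo)
        if length >= min_len and (best is None or length > best[0]):
            best = (length, lo, hi)
    return best


# A made-up diagram, already flattened to its messages in top-to-bottom order.
diagram = [
    ("ui", "auth", "login"),
    ("auth", "db", "query"),
    ("db", "auth", "result"),
    ("ui", "cart", "add"),
    ("auth", "db", "query"),
    ("db", "auth", "result"),
]
print(find_duplication(diagram))  # prints (2, 1, 4)
```

Here the two-message exchange between `auth` and `db` occurs at positions 1 and 4. A real suffix tree would find the same repeats in linear time, whereas the naive sort above is only suitable for small inputs.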
Related resources
Clone Detection in UML Sequence Diagrams Using Token Based Approach
Model-based development appears to be progressing rapidly in large-scale software companies. UML (Unified Modeling Language) is emerging as a utility in software development. In object-oriented development, UML provides the complete details for the lifecycle. UML is a standard modeling language, so it is used for the analysis, design and implementation of software-based systems. Clone detec...
A Dynamic Approach to Weighted Suffix Tree Construction Algorithm
At present, the weighted suffix tree is considered one of the most important existing data structures for analyzing molecular weighted sequences. Although a static partitioning based parallel algorithm exists for the construction of weighted suffix trees, it takes a significant amount of time for very long weighted DNA sequences. However, in our implementation of dynamic partition based...
Ultra-fast Multiple Genome Sequence Matching Using GPU
In this paper, a contrastive evaluation of massively parallel implementations of suffix tree and suffix array to accelerate genome sequence matching is proposed, based on an Intel Core i7 3770K quad-core CPU and an NVIDIA GeForce GTX680 GPU. Besides the suffix array only requiring approximately 20%∼30% of the space relative to the suffix tree, the coalesced binary search and tile optimization make suffix array clearl...
A new algorithm for detecting low-complexity regions in protein sequences
Motivation: Pair-wise alignment of protein sequences and local similarity searches produce many false positives because of compositionally biased regions, also called low-complexity regions (LCRs), of amino acid residues. Masking and filtering such regions significantly improves the reliability of homology searches and, consequently, functional predictions. Most of the available algorithms are b...
CLUSEQ: Efficient and Effective Sequence Clustering
Analyzing sequence data has become increasingly important recently in the area of biological sequences, text documents, web access logs, etc. In this paper, we investigate the problem of clustering sequences based on their structural features. As a widely recognized technique, clustering has proven to be very useful in detecting unknown object categories and revealing hidden correlations among ...
Journal title:
- IET Software
Volume 5
Publication date: 2011